Viewpoint Paper: Identifying Patient Smoking Status from Medical Discharge Records

Authors

  • Özlem Uzuner
  • Ira Goldstein
  • Yuan Luo
  • Isaac S. Kohane
Abstract

The authors organized a Natural Language Processing (NLP) challenge on automatically determining the smoking status of patients from information found in their discharge records. This challenge was issued as a part of the i2b2 (Informatics for Integrating Biology to the Bedside) project, to survey, facilitate, and examine studies in medical language understanding for clinical narratives. This article describes the smoking challenge, details the data and the annotation process, explains the evaluation metrics, discusses the characteristics of the systems developed for the challenge, presents an analysis of the results of received system runs, draws conclusions about the state of the art, and identifies directions for future research. A total of 11 teams participated in the smoking challenge. Each team submitted up to three system runs, providing a total of 23 submissions. The submitted system runs were evaluated with microaveraged and macroaveraged precision, recall, and F-measure. The systems submitted to the smoking challenge represented a variety of machine learning and rule-based algorithms. Despite the differences in their approaches to smoking status identification, many of these systems provided good results. There were 12 system runs with microaveraged F-measures above 0.84. Analysis of the results highlighted the fact that discharge summaries express smoking status using a limited number of textual features (e.g., "smok", "tobac", "cigar", Social History, etc.). Many of the effective smoking status identifiers benefit from these features.
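The microaveraged and macroaveraged precision, recall, and F-measure mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each record has exactly one gold label and one predicted label; the function names and the example labels are hypothetical and are not taken from the challenge. It also assumes the macroaveraged F-measure is the mean of the per-class F-measures, which is one common convention and may differ from the one the challenge organizers used.

```python
from collections import Counter


def per_class_counts(gold, predicted):
    """Count true positives, false positives and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted label was wrongly assigned
            fn[g] += 1  # the gold label was missed
    return tp, fp, fn


def prf(tp, fp, fn):
    """Precision, recall and balanced F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def micro_macro(gold, predicted):
    """Return ((P, R, F) microaveraged, (P, R, F) macroaveraged)."""
    tp, fp, fn = per_class_counts(gold, predicted)
    labels = sorted(set(gold) | set(predicted))
    # Microaverage: pool the counts over all classes, then compute the scores.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macroaverage (assumed convention): score each class, then take the mean.
    per_class = [prf(tp[c], fp[c], fn[c]) for c in labels]
    macro = tuple(sum(s[i] for s in per_class) / len(per_class) for i in range(3))
    return micro, macro
```

Calling `micro_macro(["smoker", "smoker", "smoker", "non-smoker"], ["smoker", "smoker", "non-smoker", "non-smoker"])` pools three correct and one incorrect decision for the microaverage, while the macroaverage weights the two classes equally regardless of how many records each contains. That difference matters when some smoking categories are rare.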


Similar resources

Automatic Extraction of Semantic Content from Medical Discharge Records

Semi-structured medical texts like discharge summaries are rich sources of information that can exploit the research results of physicians by performing statistical analysis of similar cases. In this paper we introduce a system based on Machine Learning algorithms that successfully classifies discharge records according to the smoking status of the patient (we distinguish between current smoker...


Identifying Smoking Status From Implicit Information in Medical Discharge Summaries

Human annotators and natural language applications are able to identify smoking status from discharge summaries with high accuracy when explicit evidence regarding their smoking status is present in the summary. We explore the possibility of identifying the smoking status from discharge summaries when these smoking terms have been removed. We present results using a Naïve Bayes classifier on a ...


Technical Brief: Using Implicit Information to Identify Smoking Status in Smoke-blind Medical Discharge Summaries

As part of the 2006 i2b2 NLP Shared Task, we explored two methods for determining the smoking status of patients from their hospital discharge summaries when explicit smoking terms were present and when those same terms were removed. We developed a simple keyword-based classifier to determine smoking status from de-identified hospital discharge summaries. We then developed a Naïve Bayes classif...


Wicentowski and Sydes, Technical Brief: Using Implicit Information to Identify Smoking Status in Smoke-Blind Medical Discharge Summaries

Abstract: As part of the 2006 i2b2 NLP Shared Task, we explored two methods for determining the smoking status of patients from their hospital discharge summaries when explicit smoking terms were present and when those same terms were removed. We developed a simple keyword-based classifier to determine smoking status from de-identified hospital discharge summaries. We then developed a Naï...


Is smoking status routinely recorded when patients register with a new GP?

Background: The process of registering new patients in primary care provides an ideal opportunity to assess their smoking status systematically and record this in electronic medical records; this identification then allows smokers to be targeted with effective cessation interventions. Objective: To use a dataset of electronic primary care medical records to assess the extent to which primary ca...



Journal:
  • Journal of the American Medical Informatics Association : JAMIA

Volume 15, Issue 1

Pages: -

Publication date: 2008